Abstract

We consider a slotted-ALOHA LAN with loss-averse, noncooperative greedy users. To avoid non-Pareto equilibria, 
particularly deadlock, we assume probabilistic loss-averse behavior. This behavior is modeled as a modulated white 
noise term, in addition to the greedy term, creating a diffusion process modeling the game. We observe that when 
players modulate with their throughput, a more efficient exploration of play-space (by Gibbs sampling) results, and
so finding a Pareto equilibrium is more likely over a given interval of time. 



I. Introduction 

The "by rule" window flow control mechanisms of, e.g., TCP and CSMA, have elements of both proactive and 
reactive communal congestion control suitable for distributed/information-limited high-speed networking scenarios. 
Over the past ten years, game theoretic models for medium access and flow control have been extensively explored 
in order to consider the effects of even a single end-user/player who greedily departs from such prescribed/standard 
behaviors [1], [6], [9], [13]-[16], [22]-[24], [27]. Greedy end-users may have a dramatic effect on the overall
"fairness" of the communication network under consideration. So, if even one end-user acts in a greedy way, it may
be prudent for all of them to do so. However, even end-users with a noncooperative disposition may temporarily
not practice greedy behavior in order to escape from sub-optimal (non-Pareto) Nash equilibria. In more general
game theoretic contexts, the reluctance of an end-user to act in a non-greedy fashion is called loss aversion [7].

In this note, we focus on simple slotted-ALOHA MAC for a LAN. We begin with a noncooperative model of
end-user behavior. Despite the presence of a stable interior Nash equilibrium, this system was shown in [13], [14]
to have a large domain of attraction to deadlock, where every player's transmission probability is one and so
every player's throughput is zero (here assuming feasible demands and throughput-based costs). To avoid non-Pareto
Nash equilibria, particularly those involving zero throughput for some or all users, we assume that end-users will
probabilistically engage in non-greedy behavior. That is, we adopt a stochastic model of loss aversion, a behavior
whose aim is long-term communal betterment.
We may be able to model a play that reduces net utility using a single "temperature" parameter T in the manner
of simulated annealing (e.g., [12]); i.e., plays that increase net utility are always accepted and plays that reduce
net utility are (sometimes) accepted with probability increasing in T, so the players are (collectively) less loss
averse with larger T. Though our model of probabilistic loss aversion is related to that of simulated annealing by
diffusions [10], [28], even with a free meta-parameter ($\eta$ or $\eta w$ below) possibly interpretable as temperature, our
modeling aim is not centralized annealing (temperature cooling) but rather decentralized exploration of play-space by
noncooperative users.

We herein do not model how the end-users will keep track of the best (Pareto) equilibria previously played/discovered.¹
Because the global extrema of the global objective functions (Gibbs exponents) we derive do not necessarily corre- 
spond to Pareto equilibria, we do not advocate collective slow "cooling" (annealing) of the equivalent temperature 
parameters. Also, we do not model how end-user throughput demands may be time-varying, a scenario which would 
motivate the "continual search" aspect of the following framework. 

The following stochastic approach to distributed play-space search is also related to "aspiration" of repeated
games [3], [8], [18], where a play resulting in suboptimal utility may be accepted when the utility is less than
a threshold, say according to a "mutation" probability [17], [25]. This type of "bounded rational" behavior has been
proposed to find Pareto equilibria, in particular for distributed settings where players act with limited information
[25]. Clearly, given a global objective L whose global maxima correspond to Pareto equilibria, these ideas are
similar to the use of simulated annealing to find the global maxima of L while avoiding suboptimal local maxima.
This paper is organized as follows. In Section II, we formulate the basic ALOHA noncooperative game under
consideration. Our stochastic framework (a diffusion) for loss aversion is given in Section III; for two different
modulating terms of the white-noise process, the invariant distribution in the collective play-space is derived. A
two-player numerical example is used to illustrate the performance of these two approaches in Section IV. We
conclude in Section V with a discussion of future work.

II. A DISTRIBUTED SLOTTED-ALOHA GAME FOR LAN MAC

Consider an idealized² ALOHA LAN where each user/player $i \in \{1, 2, \ldots, n\}$ has a (potentially different)
transmission probability $v_i$. For the collective "play" $v = (v_1, v_2, \ldots, v_n)$, the net utility of player $i$ is

$$V_i(v) = U_i(\theta_i(v)) - M\theta_i(v), \tag{1}$$

where the strictly concave and increasing utility $U_i$ of steady-state throughput

$$\theta_i(v) := v_i \prod_{j \neq i} (1 - v_j)$$

is such that $U_i(0) = 0$, and the throughput-based price is $M$. So, the throughput-demand of the $i$th player is

$$y_i := (U_i')^{-1}(M).$$

This is a quasi-stationary game wherein future action is based on the outcome of the current collective play $v$
observed in steady-state [5].

¹ The players could, e.g., alternate between (loss averse) greedy behavior to discover Nash equilibrium points, and the play dynamics modeled
herein for breadth of search (to escape non-Pareto equilibria).

² We herein do not consider physical layer channel phenomena such as shadowing and fading as in, e.g., | Iftj . j^J.

February 29, 2012 DRAFT



The corresponding continuous Jacobi iteration of the better response dynamics is [13], [14], [26]: for all $i$,

$$\frac{d}{dt} v_i = \frac{y_i}{\prod_{j \neq i} (1 - v_j)} - v_i =: -E_i(v) \tag{2}$$

(cf. [6]). Note that we define $-E_i$, instead of $E_i$, to be consistent with the notation of [28], which seeks to minimize
a global objective, though we want to maximize such objectives in the following.

Such dynamics generally exhibit multiple Nash equilibria, including non-Pareto equilibria with significant domains
of attraction. Our ALOHA context has a stable deadlock equilibrium point where all players always transmit,
i.e., $v = \mathbf{1} := (1, 1, \ldots, 1)$ [13], [14].
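The equilibrium structure described above can be checked numerically. The following Python sketch assumes the drift in (2) takes the form $-E_i(v) = y_i/\prod_{j \neq i}(1 - v_j) - v_i$ and uses the two-player demands $y = (8/15, 1/15)$ from the numerical example of Section IV. The helper names (`throughput`, `neg_E`, `jacobi_flow`), the forward-Euler step size, and the cap just below 1 are illustrative assumptions, not part of the original model.

```python
import numpy as np

def throughput(v):
    """theta_i(v) = v_i * prod_{j != i} (1 - v_j)."""
    v = np.asarray(v, dtype=float)
    return np.array([v[i] * np.prod(np.delete(1.0 - v, i)) for i in range(len(v))])

def neg_E(v, y):
    """Assumed right-hand side of (2): y_i / prod_{j != i} (1 - v_j) - v_i."""
    v = np.asarray(v, dtype=float)
    idle = np.array([np.prod(np.delete(1.0 - v, i)) for i in range(len(v))])
    return np.asarray(y, dtype=float) / idle - v

def jacobi_flow(v0, y, dt=1e-3, steps=20000, cap=0.999):
    """Forward-Euler integration of dv/dt = -E(v); plays are clipped to [0, cap]
    (the cap is a numerical safeguard, since the drift diverges at v = 1)."""
    v = np.asarray(v0, dtype=float).copy()
    for _ in range(steps):
        v = np.clip(v + dt * neg_E(v, y), 0.0, cap)
    return v

y = np.array([8 / 15, 1 / 15])
v_stable = np.array([2 / 3, 1 / 5])   # interior equilibrium reported as locally stable
v_saddle = np.array([4 / 5, 1 / 3])   # interior equilibrium reported as a saddle

print(throughput(v_stable), throughput(v_saddle))  # both should match the demands y
print(jacobi_flow([0.6, 0.15], y))   # start below the stable point
print(jacobi_flow([0.9, 0.5], y))    # start above the saddle point
```

In this sketch, a start below the stable point settles near $(2/3, 1/5)$, whereas a start above the saddle point runs up to the cap, which illustrates the deadlock domain of attraction discussed in the text.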



III. A DIFFUSION MODEL OF LOSS AVERSION 

Generally in the following, we consider differently loss-averse players. Both examples considered are arguably 
distributed (information limited) games wherein every player's choice of transmission probability is based on 
information knowable to them only through their channel observations, so that consultation among users is not 
required. In particular, players are not directly aware of each other's demands (y). 

A. Model overview 

We now model stochastic perturbation of the Jacobi dynamics (2), allowing for suboptimal plays despite loss
aversion, together with a sigmoid mapping $g$ to ensure plays (transmission probabilities) $v$ remain in a feasible
hyperrectangle $D \subset [0, 1]^n$ (i.e., the feasible play-space for $v$): for all $i$,

$$du_i = -E_i(v)\,dt + \sigma_i(v)\,dW_i \tag{3}$$

$$v_i = g_i(u_i) \tag{4}$$

where the $W_i$ are independent standard Brownian motions. An example sigmoid is

$$g(u) := \gamma(\tanh(u/w) + \delta), \tag{5}$$

where $1 < \delta < 2$ and $0 < \gamma < 1/(1 + \delta)$. Thus, $\inf_u g(u) = \inf v = \gamma(-1 + \delta) > 0$ and $\sup_u g(u) = \sup v =
\gamma(1 + \delta) < 1$. Again, to escape from the domains of attraction of non-Pareto equilibria, the deterministic Jacobi
dynamics (i.e., $-E_i(v)\,dt$ in (3)) have been perturbed by white noise ($dW_i$), here modulated by a diffusion term of
the form

$$\sigma_i(v) = \sqrt{\frac{2 h_i(v)}{f_i(v_i)}},$$

where

$$f_i(v_i) := g_i'(g_i^{-1}(v_i)).$$

For the example sigmoid (5),

$$f(v) = \frac{\gamma}{w}\left(1 - \left(\frac{v}{\gamma} - \delta\right)^2\right).$$

In the following, we will consider different functions $h_i$ leading to Gibbs invariant distributions for $v$.
Note that the discrete-time ($k$) version of this game model would be

$$u_i(k+1) - u_i(k) = -E_i(v(k))\,\epsilon + \sigma_i(v(k))\,N_i(k)$$

$$v_i(k+1) = g_i(u_i(k+1)) \tag{6}$$

where the $N_i(k)$ are all i.i.d. normal $N(0, \epsilon)$ random variables.
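The discrete-time recursion (6) can be simulated directly. The sketch below is a minimal Euler-Maruyama-style implementation under several assumptions: the drift is $-E_i(v) = y_i/\prod_{j \neq i}(1 - v_j) - v_i$, the sigmoid is the example (5) with a common $\gamma$, $\delta$, $w$ for all players, the diffusion term is $\sigma_i = \sqrt{2 h_i / f_i}$, and $h_i$ is passed in as a function (the example `h_decreasing` uses the form $\eta y_i (1 - v_i)^2$ studied in Section III-B). The clipping of $u$ to $[-u_{max} w, u_{max} w]$ is a purely numerical safeguard that is not in the model; all function and parameter names are hypothetical.

```python
import numpy as np

def idle_others(v):
    """prod_{j != i} (1 - v_j): channel idle probability as seen by player i."""
    return np.array([np.prod(np.delete(1.0 - v, i)) for i in range(len(v))])

def h_decreasing(v, y, eta):
    """Assumed form of the diffusion factor: h_i = eta * y_i * (1 - v_i)^2."""
    return eta * y * (1.0 - v) ** 2

def simulate(y, h, eta, v0, gamma=0.4, delta=1.125, w=1.0,
             eps=1e-3, steps=5000, u_max=5.0, seed=0):
    """Iterate the recursion (6) and return the path of plays v(0), ..., v(steps)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    v0 = np.asarray(v0, dtype=float)
    u = w * np.arctanh(v0 / gamma - delta)            # u(0) = g^{-1}(v(0))
    path = np.empty((steps + 1, len(y)))
    for k in range(steps + 1):
        v = gamma * (np.tanh(u / w) + delta)          # v = g(u), sigmoid (5)
        path[k] = v
        drift = y / idle_others(v) - v                # -E_i(v(k))
        f = (gamma / w) / np.cosh(u / w) ** 2         # f_i = g'(u_i)
        sigma = np.sqrt(2.0 * h(v, y, eta) / f)       # diffusion term
        noise = rng.normal(0.0, np.sqrt(eps), size=len(y))  # N(0, eps): variance eps
        # Clip u so that tanh never saturates to exactly +/-1 (numerical guard only).
        u = np.clip(u + drift * eps + sigma * noise, -u_max * w, u_max * w)
    return path

path = simulate([8 / 15, 1 / 15], h_decreasing, eta=1.0, v0=[0.5, 0.2], steps=2000)
print(path.min(axis=0), path.max(axis=0))  # plays stay inside (0.05, 0.85)
```

With $\gamma = 0.4$ and $\delta = 1.125$, the sigmoid maps onto $(0.05, 0.85)$, the play range used in Section IV. A small step $\epsilon$ matters here because $\sigma_i$ grows as $f_i \to 0$ near the edges of the play-space.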

The system just described is a variation of E. Wong's diffusion machine [28], the difference being the introduction
of the term $h$ instead of a temperature meta-parameter $T$. Also, the diffusion function $\sigma_i$ is player-$i$ dependent at
least through $h_i$. Finally, under the slotted-ALOHA dynamics, there is no function $E(v)$ such that $\partial E/\partial v_i = E_i$,
so we will select the diffusion factors $h_i$ to achieve a tractable Gibbs stationary distribution of $v$, and interpret them
in terms of player loss aversion.

Note that in the diffusion machine, a common temperature parameter $T$ may be slowly reduced to zero to find
the minimum of a global potential function (the exponent of the Gibbs stationary distribution of $v$) [10], [21], in
the manner of simulated annealing. Again, the effective temperature parameter here ($\eta$ or $\eta w$) will be constant.

B. Example diffusion term $h_i$ decreasing in $v_i$

In this subsection, we analyze the model when, for all $i$,

$$h_i(v_i) := \eta y_i (1 - v_i)^2, \tag{7}$$

with $\eta > 0$ a free meta-parameter (assumed common to all players). So, a greedier player $i$ (larger $y_i$) will generally
tend to be less loss averse (larger $h_i$), except when their current retransmission play $v_i$ is large.



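Theorem 3.1: The stationary probability density function of $v \in D \subset [0, 1]^n$, defined by (4) and (7), is

$$p(v) = \frac{1}{Z} \exp\left(\frac{\Lambda(v)}{\eta} - \log H(v)\right), \tag{8}$$

where $Z$ is the normalizing term,

$$D := \prod_{i=1}^{n} \big(\gamma_i(-1 + \delta_i),\ \gamma_i(1 + \delta_i)\big),$$

$$\Lambda(v) := \prod_{j=1}^{n} \frac{1}{1 - v_j} - \sum_{j=1}^{n} \frac{1}{y_j}\left(\frac{v_j}{1 - v_j} + \log(1 - v_j)\right), \text{ and}$$

$$H(v) := \prod_{j=1}^{n} (1 - v_j)^2.$$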



Remark: $\Lambda$ is a Lyapunov function of the deterministic ($\sigma_i = 0$ for all $i$) Jacobi iteration [13], [14].



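Proof: Applying Ito's lemma [19], [28] to (3) and (4) gives

$$\begin{aligned}
dv_i &= g_i'(u_i)\,du_i + \tfrac{1}{2} g_i''(u_i)\,\sigma_i^2(v)\,dt \\
&= \left[-f_i(v_i) E_i(v) + \tfrac{1}{2} g_i''(g_i^{-1}(v_i))\,\sigma_i^2(v)\right] dt + f_i(v_i)\,\sigma_i(v)\,dW_i,
\end{aligned}$$

where the derivative operator $z' := \frac{d}{dv_i} z(v_i)$ and we have just substituted (3) for the second equality. From the
Fokker-Planck (Kolmogorov forward) equation for this diffusion [19], [28], we get the following equation for the
time-invariant (stationary) distribution $p$ of $v$: for all $i$,

$$0 = \tfrac{1}{2}\,\partial_i\!\left(f_i^2 \sigma_i^2 p\right) - \left[-f_i E_i + \tfrac{1}{2} g_i''(g_i^{-1}(v_i))\,\sigma_i^2\right] p,$$

where the operator $\partial_i := \partial/\partial v_i$.
Now note that

$$f_i^2(v_i)\,\sigma_i^2(v) = 2 h_i(v_i) f_i(v_i) \quad \text{and}$$

$$\tfrac{1}{2} g_i''(g_i^{-1}(v_i))\,\sigma_i^2(v) = h_i(v_i)\, g_i''(g_i^{-1}(v_i)) / f_i(v_i) = h_i(v_i) f_i'(v_i).$$

So, the previous display reduces to

$$\begin{aligned}
0 &= \partial_i (h_i f_i p) - (-E_i f_i + h_i f_i')\, p \\
&= (h_i\, \partial_i p + h_i' p + E_i p)\, f_i,
\end{aligned}$$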
where the second equality is due to cancellation of the $h_i f_i' p$ terms. For all $i$, since $f_i > 0$,

$$\frac{\partial_i p(v)}{p(v)} = -\frac{E_i(v)}{h_i(v_i)} - \frac{h_i'(v_i)}{h_i(v_i)} \tag{9}$$

$$= \frac{1}{\eta}\, \partial_i \Lambda(v) + \frac{2}{1 - v_i}.$$

Finally, (8) follows by direct integration. □

Unfortunately, the exponent of $p$ under (7),

$$\tilde{\Lambda}(v) := \frac{\Lambda(v)}{\eta} - \log H(v),$$

and both its component terms $\Lambda$ and $-\log H$, remain maximal in the deadlock region near $\mathbf{1}$.
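The integration step in the proof of Theorem 3.1 can be sanity-checked numerically. The sketch below assumes $E_i(v) = v_i - y_i/\prod_{j \neq i}(1 - v_j)$, $h_i = \eta y_i (1 - v_i)^2$, $\Lambda(v) = \prod_j 1/(1 - v_j) - \sum_j (v_j/(1 - v_j) + \log(1 - v_j))/y_j$ and $H(v) = \prod_j (1 - v_j)^2$, and compares a central-difference gradient of the exponent $\Lambda/\eta - \log H$ with the expression $-E_i/h_i - h_i'/h_i$. The function names, test point, and value of $\eta$ are arbitrary illustrative choices.

```python
import numpy as np

def Lambda(v, y):
    """Assumed Lyapunov function: prod_j 1/(1-v_j) - sum_j (v_j/(1-v_j) + log(1-v_j)) / y_j."""
    return np.prod(1.0 / (1.0 - v)) - np.sum((v / (1.0 - v) + np.log(1.0 - v)) / y)

def exponent_7(v, y, eta):
    """Exponent of the Gibbs density: Lambda/eta - log H, with H = prod_j (1-v_j)^2."""
    return Lambda(v, y) / eta - 2.0 * np.sum(np.log(1.0 - v))

def score_7(v, y, eta):
    """-E_i/h_i - h_i'/h_i for h_i = eta * y_i * (1 - v_i)^2."""
    idle = np.array([np.prod(np.delete(1.0 - v, i)) for i in range(len(v))])
    E = v - y / idle
    h = eta * y * (1.0 - v) ** 2
    dh = -2.0 * eta * y * (1.0 - v)
    return -E / h - dh / h

def central_gradient(fun, v, step=1e-6):
    """Central-difference approximation of the gradient of fun at v."""
    grad = np.empty_like(v)
    for i in range(len(v)):
        e = np.zeros_like(v)
        e[i] = step
        grad[i] = (fun(v + e) - fun(v - e)) / (2.0 * step)
    return grad

y = np.array([8 / 15, 1 / 15])
eta = 0.7
v = np.array([0.3, 0.6])
numeric = central_gradient(lambda x: exponent_7(x, y, eta), v)
analytic = score_7(v, y, eta)
print(numeric, analytic)  # the two vectors should agree to several digits
```

Agreement of the two gradients at arbitrary interior points supports the claim that the exponent integrates the per-coordinate stationarity condition; it does not, of course, replace the proof.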

C. Example diffusion term $h_i$ increasing in $v_i$

The following alternative diffusion term $h_i$ is an example which is instead increasing in $v_i$, but decreasing in the
channel idle time from player $i$'s point of view [2], [11],

$$h_i(v) := \frac{\eta v_i}{\prod_{j \neq i} (1 - v_j)}. \tag{10}$$

That a user would be less loss averse (higher $h$) when the channel was perceived to be more idle may be a reflection
of a "dynamic" altruism [2] (i.e., a player is more courteous as s/he perceives that others are). The particular form
of (10) also leads to another tractable Gibbs distribution for $v$.



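Theorem 3.2: Using (10), the stationary probability density function of the diffusion $v$ on $[0, 1]^n$ is

$$p(v) = \frac{1}{W} \exp(\hat{\Lambda}(v)) \tag{11}$$

where

$$\hat{\Lambda}(v) := \sum_{i=1}^{n} \left(\frac{y_i}{\eta} - 1\right) \log v_i + \frac{1}{\eta} \prod_{i=1}^{n} (1 - v_i) \tag{12}$$

and $W$ is the normalizing term.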



Proof: Following the proof of Theorem 3.1, the invariant density here also satisfies (9):

$$\partial_i \log p(v) = -\frac{E_i(v)}{h_i(v)} - \partial_i \log h_i(v).$$

Substituting (10) gives:

$$\partial_i \log p(v) = \frac{y_i/\eta - 1}{v_i} - \frac{1}{\eta} \prod_{j \neq i} (1 - v_j).$$

So, we obtain (12) by direct integration. □



Note that if $\eta > \max_i y_i$, then $\hat{\Lambda}$ is strictly decreasing in $v_i$ for all $i$, and so will be minimal in the deadlock region
(unlike $\tilde{\Lambda}$). So the stationary probability in the region of deadlock will be low. However, large $\eta$ may result in the
stationary probability close to $0$ being very high. So, we see that the meta-parameter $\eta$ (or $\eta w$) here plays a more
significant role (though the parameters $\delta$ and $\gamma$ in $g$ play a more significant role in the former objective $\tilde{\Lambda}$ owing
to its global extrema at $\mathbf{1}$).
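The monotonicity remark can be illustrated numerically. The following sketch assumes $h_i(v) = \eta v_i/\prod_{j \neq i}(1 - v_j)$, $E_i(v) = v_i - y_i/\prod_{j \neq i}(1 - v_j)$, and the exponent $\hat{\Lambda}(v) = \sum_i (y_i/\eta - 1)\log v_i + \prod_i (1 - v_i)/\eta$. It checks at random interior points that every partial derivative of $\hat{\Lambda}$ is negative when $\eta > \max_i y_i$, and compares the analytic partial derivatives with finite differences at one point. The names, sample size, and sampling range are illustrative assumptions.

```python
import numpy as np

def exponent_10(v, y, eta):
    """Assumed Gibbs exponent: sum_i (y_i/eta - 1) log v_i + prod_i (1 - v_i) / eta."""
    return np.sum((y / eta - 1.0) * np.log(v)) + np.prod(1.0 - v) / eta

def score_10(v, y, eta):
    """-E_i/h_i - d/dv_i log h_i for h_i = eta * v_i / prod_{j != i} (1 - v_j)."""
    idle = np.array([np.prod(np.delete(1.0 - v, i)) for i in range(len(v))])
    E = v - y / idle
    h = eta * v / idle
    return -E / h - 1.0 / v      # d/dv_i log h_i = 1 / v_i

def central_gradient(fun, v, step=1e-6):
    """Central-difference approximation of the gradient of fun at v."""
    grad = np.empty_like(v)
    for i in range(len(v)):
        e = np.zeros_like(v)
        e[i] = step
        grad[i] = (fun(v + e) - fun(v - e)) / (2.0 * step)
    return grad

y = np.array([8 / 15, 1 / 15])
eta = 1.0                                   # eta > max_i y_i
rng = np.random.default_rng(1)
samples = rng.uniform(0.05, 0.95, size=(1000, 2))
all_decreasing = all(np.all(score_10(v, y, eta) < 0.0) for v in samples)

v = np.array([0.4, 0.25])
numeric = central_gradient(lambda x: exponent_10(x, y, eta), v)
print(all_decreasing, numeric, score_10(v, y, eta))
```

The first term of the exponent behaves like $-(1 - y_i/\eta)\log v_i$ near $v_i = 0$, which is why a large $\eta$ shifts stationary mass toward small transmission probabilities.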

IV. Numerical examples 

A. Using (7)

For an $n = 2$ player example with demands $y = (8/15, 1/15)$ and $\eta = 1$, the two interior Nash equilibria are the
locally stable (under deterministic dynamics) point $v^* = (2/3, 1/5)$ and the (unstable) saddle point at $(4/5, 1/3)$
(both with corresponding throughputs $\theta = y$) [13], [14]. Again, $\mathbf{1}$ is a stable deadlock boundary equilibrium which
is naturally to be avoided if possible as both players' throughputs are zero there, $\theta = 0$. Under the deterministic
dynamics of (2), the deadlock equilibrium $\mathbf{1}$ has a significant domain of attraction including a neighborhood of the
saddle point.

The exponent of $p$, $\tilde{\Lambda}$, for this example is depicted in Figure 1. $\tilde{\Lambda}$ has a shape similar to that of the Lyapunov
function $\Lambda$, but without interior local extrema or saddle points. The extreme mode at $\mathbf{1}$ is clearly evident.



Fig. 1. The Gibbs distribution (8) for $n = 2$ players with demands $y = (8/15, 1/15)$ under (7).

When we took $\gamma_i$, $\delta_i$ in (5) so that $0.05 < v_i = g(u_i) < 0.85$ for all players $i$, the stationary probability of the
"good" region containing the two interior Nash equilibria is

$$P(v \in [0.65, 0.82] \times [0.18, 0.35]) \approx 0.18, \tag{13}$$

as computed using (8). This probability does not appreciably improve by varying $\eta$ from 1 (both $\Lambda$ and $\log H$ diverge
at $\mathbf{1}$), though it does dramatically decrease as $\sup_u g(u) \uparrow 1$; e.g., if we take $\sup_u g(u) = 0.9$ then (13) decreases
to about $10^{-54}$ (essentially zero, of course), which is consistent with Figure 1. Also, the largest contribution to this
probability is from the region around the saddle point, which is part of the deadlock domain of attraction of $\mathbf{1}$, the global
maximum of $\tilde{\Lambda}$.

To reiterate, the fundamental advantage of stochastic loss aversion is seen by comparing the large domain of
attraction of the deadlock equilibrium of the deterministic dynamics [14] with the positive probability of presence near
the interior Nash equilibrium points where both players' demands $y$ are satisfied. This advantage is borne out more
clearly in the following example.

B. Using (10)

For the same two-player example with $\eta = 4.5/15 = (y_1 + y_2)/2$ and $\sup_u g(u) = 0.9$, the probability in (13)
was 0.05, compared to essentially zero for the same parameter range for (7). The exponent of $p$, $\hat{\Lambda}$, for this example
is depicted in Figure 2. $\hat{\Lambda}$ is dissimilar to the function $\tilde{\Lambda}$ (or $\Lambda$ or $H$), without significant modes for $\eta$ in the range
close to the demands $y$, and with a much smaller overall range of values (and likewise for the stationary density $p$
of $v$). Though the performance under (10) was more sensitive to $\eta$ (temperature) than under (7), using (10) clearly
resulted in more effective searching of the play-space $D$ and was far less sensitive to the parameters $\delta$, $\gamma$ defining
it.
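Region probabilities such as (13) can be approximated by integrating the unnormalized Gibbs density on a grid. The sketch below does this for $n = 2$ with a midpoint rule, assuming the exponents $\tilde{\Lambda} = \Lambda/\eta - \log H$ (for the diffusion term decreasing in $v_i$) and $\hat{\Lambda}$ (for the diffusion term increasing in $v_i$) in the forms used elsewhere in this note. The grid resolution, the lower play bound of 0.05 in the second call, and all function names are assumptions; the sketch is not claimed to reproduce the reported figures, which depend on the exact play-space and integration method used.

```python
import numpy as np

Y1, Y2 = 8 / 15, 1 / 15   # demands of the two-player example

def exponent_7(v1, v2, eta):
    """Assumed exponent Lambda/eta - log H for two players (vectorized)."""
    lam = (1.0 / ((1 - v1) * (1 - v2))
           - (v1 / (1 - v1) + np.log(1 - v1)) / Y1
           - (v2 / (1 - v2) + np.log(1 - v2)) / Y2)
    return lam / eta - 2.0 * (np.log(1 - v1) + np.log(1 - v2))

def exponent_10(v1, v2, eta):
    """Assumed exponent sum_i (y_i/eta - 1) log v_i + (1-v1)(1-v2)/eta (vectorized)."""
    return ((Y1 / eta - 1) * np.log(v1) + (Y2 / eta - 1) * np.log(v2)
            + (1 - v1) * (1 - v2) / eta)

def region_probability(exponent, eta, lo, hi, box, m=400):
    """Midpoint-rule estimate of P(v in box) for density ~ exp(exponent) on [lo, hi]^2."""
    edges = np.linspace(lo, hi, m + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    V1, V2 = np.meshgrid(mid, mid, indexing="ij")
    logp = exponent(V1, V2, eta)
    weight = np.exp(logp - logp.max())          # subtract the max for numerical stability
    (a1, b1), (a2, b2) = box
    inside = (V1 >= a1) & (V1 <= b1) & (V2 >= a2) & (V2 <= b2)
    return weight[inside].sum() / weight.sum()

good = ((0.65, 0.82), (0.18, 0.35))             # region containing both interior equilibria
p7 = region_probability(exponent_7, 1.0, 0.05, 0.85, good)
p10 = region_probability(exponent_10, 4.5 / 15, 0.05, 0.90, good)
print(p7, p10)
```

Because all cells have equal area, the cell-area factor cancels in the ratio, so only the exponentiated weights are summed. Changing `hi` shows how strongly each estimate depends on the upper end of the play-space.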

V. Conclusions and Future Work 



The diffusion term (10) was clearly more effective than (7) at exploring the play-space and, in so doing, was
dramatically less sensitive to the choice of the parameters $\delta$ and $\gamma$ governing the range of the play-space $D$.

In future work, we plan to explore other diffusion factors $h$ (numerically if they do not lead to a Gibbs stationary
distribution $p$) with a goal to reduce the stationary probability that $v$ occupies the boundary regions. Also, we will
consider a model with power-based costs, i.e., $Mv$ instead of $M\theta$ in the net utility (1). Finally, we will study the
effects of asynchronous and/or multirate play among the users [2], [4], [15].